Correlation Clustering in Data Streams
Abstract
Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as k-center, k-median, and k-means. Such algorithms need to be both time and space efficient. In this paper, we address the problem of correlation clustering in the dynamic data stream model. The stream consists of updates to the edge weights of a graph on n nodes and the goal is to find a node-partition such that the end-points of negative-weight edges are typically in different clusters whereas the end-points of positive-weight edges are typically in the same cluster. We present polynomial-time, $O(n \cdot \mathrm{polylog}\, n)$-space approximation algorithms for natural problems that arise. We first develop data structures based on linear sketches that allow the "quality" of a given node-partition to be measured. We then combine these data structures with convex programming and sampling techniques to solve the relevant approximation problem. Unfortunately, the standard LP and SDP formulations are not obviously solvable in $O(n \cdot \mathrm{polylog}\, n)$-space. Our work presents space-efficient algorithms for the convex programming required, as well as approaches to reduce the adaptivity of the sampling.
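To make the objective concrete, the following minimal sketch accumulates a dynamic stream of edge-weight updates and then measures the "quality" of a given node-partition as its total disagreement weight: positive-weight edges cut by the partition plus negative-weight edges left inside a cluster. It is an illustration only, not the paper's method: it stores every edge exactly, so it can use quadratic space, whereas the paper relies on linear sketches to stay within $O(n \cdot \mathrm{polylog}\, n)$ space. The function names `apply_updates` and `disagreement` and the `(u, v, delta)` update format are assumptions made for this example.

```python
from collections import defaultdict


def apply_updates(updates):
    """Fold a stream of (u, v, delta) updates into final edge weights.

    Hypothetical update format: each update adds `delta` to the weight of
    the undirected edge {u, v}. Edges whose weight returns to 0 are dropped.
    """
    weights = defaultdict(int)
    for u, v, delta in updates:
        edge = (min(u, v), max(u, v))  # canonical key for an undirected edge
        weights[edge] += delta
    return {edge: w for edge, w in weights.items() if w != 0}


def disagreement(weights, cluster_of):
    """Total weight of edges that disagree with the partition `cluster_of`.

    A positive edge disagrees when its end-points are in different clusters;
    a negative edge disagrees when its end-points share a cluster.
    """
    cost = 0
    for (u, v), w in weights.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if w > 0 and not same_cluster:
            cost += w
        elif w < 0 and same_cluster:
            cost += -w
    return cost


# A small dynamic stream on 4 nodes; the edge (2, 3) is first inserted with
# weight +1 and later updated by -2, ending at weight -1.
updates = [(0, 1, 1), (1, 2, 1), (0, 2, -1), (2, 3, 1), (2, 3, -2)]
weights = apply_updates(updates)

one_big_cluster = {0: 0, 1: 0, 2: 0, 3: 1}   # clusters {0, 1, 2} and {3}
singletons = {0: 0, 1: 1, 2: 2, 3: 3}        # every node alone

cost_big = disagreement(weights, one_big_cluster)
cost_single = disagreement(weights, singletons)
```

Here the partition {0, 1, 2}, {3} pays only for the negative edge (0, 2) inside its cluster, while the all-singletons partition pays for both positive edges, so the first partition has the lower disagreement.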
Similar resources
Correlation Clustering in Data Streams
In this paper, we address the problem of correlation clustering in the dynamic data stream model. The stream consists of updates to the edge weights of a graph on n nodes and the goal is to find a node-partition such that the end-points of negative-weight edges are typically in different clusters whereas the end-points of positive-weight edges are typically in the same cluster. We present polyn...
Clustering Data Streams
We study clustering under the data stream model of computation where: given a sequence of points, the objective is to maintain a consistently good clustering of the sequence observed so far, using a small amount of memory and time. The data stream model is relevant to new classes of applications involving massive data sets, such as web click stream analysis and multimedia data analysis. We g...
Clustering Geometric Data Streams
Using recent knowledge in data stream clustering we present a modified approach to the facility location problem in the context of geometric data streams. We give insight to the existing algorithm from a less mathematical point of view, focusing on understanding and practical use, namely by computer graphics experts. We propose a modification of the original data stream k-median clustering to s...
Clustering categorical data streams
The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams becomes more difficult, because the data objects in a data stream must be accessed in order and can be read only once or few times with limited resources. Rec...
Journal
Journal title: Algorithmica
Year: 2021
ISSN: 1432-0541, 0178-4617
DOI: https://doi.org/10.1007/s00453-021-00816-9